{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## 5.1 Reading and Writing Text Data\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Problem\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "You need to read or write text data in a variety of encodings, such as ASCII, UTF-8, or UTF-16."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Solution\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Use the open() function with mode rt to read a text file. For example:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Read the entire file as a single string\nwith open('somefile.txt', 'rt') as f:\n    data = f.read()\n\n# Iterate over the lines of the file\nwith open('somefile.txt', 'rt') as f:\n    for line in f:\n        # process line\n        ..."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Similarly, to write a text file, use open() with mode wt.\nIf the file already exists, its previous contents are cleared and overwritten. For example:"
      ]
    },
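    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Placeholder data for illustration (not part of the recipe)\ntext1, text2 = 'first chunk\\n', 'second chunk\\n'\nline1, line2 = 'first line', 'second line'\n\n# Write chunks of text data\nwith open('somefile.txt', 'wt') as f:\n    f.write(text1)\n    f.write(text2)\n    ...\n\n# Redirect print() output to the file\nwith open('somefile.txt', 'wt') as f:\n    print(line1, file=f)\n    print(line2, file=f)\n    ..."
      ]
    },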
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To append to the end of an existing file, use open() with mode at."
      ]
    },
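    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The difference between the wt and at modes can be checked with a small self-contained sketch.\n",
        "It writes into a temporary directory so that no real file is touched; the path and the strings written are illustrative assumptions, not part of the original recipe."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import os\n",
        "import tempfile\n",
        "\n",
        "with tempfile.TemporaryDirectory() as tmpdir:\n",
        "    # Hypothetical path inside a throwaway directory\n",
        "    path = os.path.join(tmpdir, 'somefile.txt')\n",
        "\n",
        "    # 'wt' creates the file, or truncates it if it already exists\n",
        "    with open(path, 'wt') as f:\n",
        "        f.write('first\\n')\n",
        "\n",
        "    # 'at' keeps the existing content and writes at the end\n",
        "    with open(path, 'at') as f:\n",
        "        f.write('second\\n')\n",
        "    with open(path, 'rt') as f:\n",
        "        after_append = f.read()\n",
        "\n",
        "    # Opening with 'wt' again discards everything written so far\n",
        "    with open(path, 'wt') as f:\n",
        "        f.write('third\\n')\n",
        "    with open(path, 'rt') as f:\n",
        "        after_truncate = f.read()\n",
        "\n",
        "print(repr(after_append))\n",
        "print(repr(after_truncate))"
      ]
    },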
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "By default, files are read and written using the locale's preferred encoding, which you can find by calling locale.getpreferredencoding(False).\nNote that sys.getdefaultencoding() reports something different: the encoding used for implicit str/bytes conversions, which is always utf-8 in Python 3.\nOn most Unix-like machines the locale encoding is utf-8, but on Windows it is often a legacy code page. If you know that the text you are reading or writing uses a different encoding,\nsupply the optional encoding parameter to open(). For example:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "with open('somefile.txt', 'rt', encoding='latin-1') as f:\n    ..."
      ]
    },
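    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a rough check of how the encoding argument behaves, the following sketch writes a temporary file as UTF-16 and reads it back, and also reports the encoding that open() picks when none is given.\n",
        "The file name and sample text are assumptions made for illustration; the default that gets printed depends on the platform and interpreter settings."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import os\n",
        "import tempfile\n",
        "\n",
        "# Assumed sample text containing two non-ASCII characters\n",
        "sample = 'h\\u00e9llo w\\u00f6rld'\n",
        "\n",
        "with tempfile.TemporaryDirectory() as tmpdir:\n",
        "    path = os.path.join(tmpdir, 'somefile.txt')\n",
        "\n",
        "    # No encoding given: the file object reports the platform-dependent default\n",
        "    with open(path, 'wt') as f:\n",
        "        default_encoding = f.encoding\n",
        "\n",
        "    # An explicit encoding on write ...\n",
        "    with open(path, 'wt', encoding='utf-16') as f:\n",
        "        f.write(sample)\n",
        "\n",
        "    # ... has to be matched by the same encoding on read\n",
        "    with open(path, 'rt', encoding='utf-16') as f:\n",
        "        restored = f.read()\n",
        "\n",
        "print(default_encoding)\n",
        "print(restored == sample)"
      ]
    },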
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Python supports a large number of text encodings. Some of the most common are ascii, latin-1, utf-8, and utf-16.\nUTF-8 is what web applications normally use.\nascii corresponds to the 7-bit characters in the range U+0000 to U+007F.\nlatin-1 is a direct mapping of the bytes 0-255 to the Unicode characters U+0000 to U+00FF.\nBecause every byte value is valid in latin-1, reading text of unknown encoding as latin-1 never produces a decoding error.\nThe decoded text may not be entirely correct,\nbut it is often enough to extract useful data from it. In addition, if you later write the data back out with the same encoding, the original bytes are preserved."
      ]
    },
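    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The claim that latin-1 never fails to decode, and that the data survives being written back, can be checked directly on bytes without touching a file.\n",
        "This is a minimal sketch; the variable names are illustrative."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Every possible byte value, 0 through 255\n",
        "raw = bytes(range(256))\n",
        "\n",
        "# Decoding cannot fail: each byte maps to the code point with the same number\n",
        "text = raw.decode('latin-1')\n",
        "code_points = [ord(ch) for ch in text]\n",
        "\n",
        "# Encoding the text again restores the original bytes exactly\n",
        "roundtrip = text.encode('latin-1')\n",
        "\n",
        "print(code_points == list(range(256)))\n",
        "print(roundtrip == raw)"
      ]
    },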
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Discussion\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Reading and writing text files is generally straightforward, but there are a few points to keep in mind.\nFirst, the with statement used in the examples establishes a context in which the file is used.\nWhen control leaves the with block, the file is closed automatically. You don't have to use the with statement, but then you must remember to close the file yourself:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "f = open('somefile.txt', 'rt')\ndata = f.read()\nf.close()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Another point concerns the recognition of newlines, which differ between Unix and Windows ( \\n and \\r\\n respectively).\nBy default, Python operates in universal newline mode.\nIn this mode, all common newline conventions are recognized when reading text, and each is converted to a single \\n character.\nSimilarly, on output the newline character \\n is converted to the system default line ending.\nIf you don't want this translation, pass newline='' to open(), like this:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Read with disabled newline translation\nwith open('somefile.txt', 'rt', newline='') as f:\n    ..."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To illustrate the difference, here is what you see on a Unix machine when reading a text file created on Windows whose raw contents are hello world!\\r\\n :"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Newline translation enabled (the default)\nf = open('hello.txt', 'rt')\nf.read()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Newline translation disabled\ng = open('hello.txt', 'rt', newline='')\ng.read()"
      ]
    },
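    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The same comparison can be reproduced on any platform with a self-contained sketch.\n",
        "It assumes a temporary file named hello.txt, written in binary mode so that the bytes on disk are exactly hello world!\\r\\n ; the temporary directory and the variable names are not part of the original example."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import os\n",
        "import tempfile\n",
        "\n",
        "with tempfile.TemporaryDirectory() as tmpdir:\n",
        "    path = os.path.join(tmpdir, 'hello.txt')\n",
        "\n",
        "    # Binary mode bypasses newline handling, so the Windows line ending is stored as-is\n",
        "    with open(path, 'wb') as f:\n",
        "        f.write(b'hello world!\\r\\n')\n",
        "\n",
        "    # Newline translation enabled (the default): \\r\\n is read as \\n\n",
        "    with open(path, 'rt') as f:\n",
        "        translated = f.read()\n",
        "\n",
        "    # Newline translation disabled: the line ending comes back unchanged\n",
        "    with open(path, 'rt', newline='') as f:\n",
        "        untranslated = f.read()\n",
        "\n",
        "print(repr(translated))\n",
        "print(repr(untranslated))"
      ]
    },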
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "A final issue concerns possible encoding errors in text files.\nWhen reading or writing a text file, you might encounter an encoding or decoding error. For example:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "f = open('sample.txt', 'rt', encoding='ascii')\nf.read()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "If you get this error, it usually means that you are not reading the file with the correct encoding.\nRead the specification of whatever you are reading carefully and confirm the file's actual encoding (for example, reading the data as UTF-8 instead of Latin-1, or whatever it needs to be).\nIf encoding errors are still a possibility, you can pass an optional errors argument to open() to deal with them.\nHere are a few common error-handling schemes:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Replace bad chars with Unicode U+fffd replacement char\nf = open('sample.txt', 'rt', encoding='ascii', errors='replace')\nf.read()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Ignore bad chars entirely\ng = open('sample.txt', 'rt', encoding='ascii', errors='ignore')\ng.read()"
      ]
    },
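    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To see the three behaviours side by side, here is a sketch that creates its own sample.txt in a temporary directory.\n",
        "The file content (the text Spicy Jalapeño stored as UTF-8) is an assumption chosen for illustration; the original does not show what sample.txt contains.\n",
        "With errors='replace' each undecodable byte becomes U+FFFD, and with errors='ignore' it is dropped."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import os\n",
        "import tempfile\n",
        "\n",
        "with tempfile.TemporaryDirectory() as tmpdir:\n",
        "    path = os.path.join(tmpdir, 'sample.txt')\n",
        "\n",
        "    # The n-with-tilde takes two bytes in UTF-8, neither of which is valid ASCII\n",
        "    with open(path, 'wt', encoding='utf-8') as f:\n",
        "        f.write('Spicy Jalape\\u00f1o')\n",
        "\n",
        "    # Default (strict) handling raises on the first undecodable byte\n",
        "    try:\n",
        "        with open(path, 'rt', encoding='ascii') as f:\n",
        "            f.read()\n",
        "        strict_failed = False\n",
        "    except UnicodeDecodeError:\n",
        "        strict_failed = True\n",
        "\n",
        "    # Replace each bad byte with the U+FFFD replacement character\n",
        "    with open(path, 'rt', encoding='ascii', errors='replace') as f:\n",
        "        replaced = f.read()\n",
        "\n",
        "    # Drop bad bytes entirely\n",
        "    with open(path, 'rt', encoding='ascii', errors='ignore') as f:\n",
        "        ignored = f.read()\n",
        "\n",
        "print(strict_failed)\n",
        "print(ascii(replaced))\n",
        "print(ascii(ignored))"
      ]
    },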
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "If you constantly find yourself reaching for the errors argument to work around encoding errors, you are probably making life harder than it needs to be.\nThe first rule of text handling is to make sure you are always using the correct encoding. When in doubt, use the default setting (typically UTF-8)."
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.7.1"
    },
    "toc": {
      "base_numbering": 1,
      "nav_menu": {},
      "number_sections": true,
      "sideBar": true,
      "skip_h1_title": true,
      "title_cell": "Table of Contents",
      "title_sidebar": "Contents",
      "toc_cell": false,
      "toc_position": {},
      "toc_section_display": true,
      "toc_window_display": true
    }
  },
  "nbformat": 4,
  "nbformat_minor": 2
}